Here we model an intervention to address malnutrition among school-aged children in urban Hanoi. The model follows the outline developed in a workshop in July 2023. A test run of the intervention will be carried out by the Center for the Development of Organic Agriculture (CODAS) under the Association of Organic Agriculture of Vietnam.
The gardens will be designed for teaching about nutrition and STEM (an approach to learning and development that integrates science, technology, engineering and maths) at primary and secondary schools. In the 2nd year the garden is expected to start running well. The 3rd year is when the education plan will be fully running.
We use several different functions to create a working model of the garden intervention.
The chance_event function generates conditional values
with:
\[ occurrence = \begin{cases} rbinom(length(value\_if), 1, chance) & if !one\_draw \\ rbinom(1, 1, chance) & if one\_draw \end{cases} \]
and final values with:
\[ chance\_event = occurrence \cdot value\_if + (1 - occurrence) \cdot value\_if\_not \]
source("functions/vv.R")
The vv function generates synthetic data with varying
means and coefficients of variation with:
These are used to calculate annual means:
If both absolute and relative trends are absent, the annual means remain constant:
\[ annual\_means = var\_mean \]
If an absolute trend is present, annual means are adjusted linearly:
\[ annual\_means = var\_mean + absolute\_trend \cdot (0, 1, \ldots, n-1) \]
If only a relative trend is present, annual means change multiplicatively:
\[ annual\_means = var\_mean \cdot (1 + relative\_trend / 100)^{(0, 1, \ldots, n-1)} \]
Generate random data points based on the specified distribution using calculated annual means:
\[ out \sim \mathcal{N}(annual\_means, \lvert annual\_means \rvert \cdot \frac{var\_CV}{100}) \]
If specified, limit data points to the defined lower and upper limits:
\[ out = \max(out, lower\_limit), \quad out = \min(out, upper\_limit) \]
source("functions/discount.R")
The discounted value of a time series value \(x_t\) at time \(t\) can be calculated using the formula:
\[ \text{Discounted Value}_t = \frac{x_t}{\left(1 + \frac{\text{Discount Rate}}{100}\right)^{t-1}} \]
Where:
\(x_t\) is the value at time \(t\),
\(\text{Discount Rate}\) is the discount rate provided to the function,
\(t\) is the time period (starting from 1).
Net Present Value (NPV) (if
calculate_NPV is set to TRUE): The Net Present
Value of a time series with discounted values can be calculated as the
sum of all discounted values:
\[ \text{NPV} = \sum_{t=1}^{n} \frac{x_t}{\left(1 + \frac{\text{Discount Rate}}{100}\right)^{t-1}} \]
Where \(n\) is the number of time periods in the time series.
source("functions/mcSimulation.R")
## The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
## which was just loaded, will retire in October 2023.
## Please refer to R-spatial evolution reports for details, especially
## https://r-spatial.org/r/2023/05/15/evolution4.html.
## It may be desirable to make the sf package available;
## package maintainers should consider adding sf to Suggests:.
## The sp package is now running under evolution status 2
## (status 2 uses the sf package in place of rgdal)
##
## Attaching package: 'decisionSupport'
## The following objects are masked _by_ '.GlobalEnv':
##
## discount, vv
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
The mcSimulation function performs Monte Carlo
simulations to estimate model outputs based on provided parameters and a
model function.
The Monte Carlo simulation process generates a set of estimated model outputs based on random input samples, providing a distribution of potential outcomes including:
random input generation with \(x_1, x_2, \ldots, x_n\) as a set of \(n\) random input samples drawn from a given distribution, representing the input parameters of the model.
a model function \(f(x)\) that maps the input data \(x\) to the corresponding model output \(y\). This can be expressed as:
\[ y_i = f(x_i), \quad i = 1, 2, \ldots, n \]
Estimated model outputs \(\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n\) obtained by applying the model function to the random input samples:
\[ \hat{y}_i = f(x_i), \quad i = 1, 2, \ldots, n \]
# Source our model
source("CODAS_Garden_Model.R")
# Ensure consistent results with the random number generator
# not for each 'run' of the MC simulation but for
# consistency each time we call on the simulation
set.seed(1234)
garden_simulation_results <- mcSimulation(
estimate = estimate_read_csv("inputs_school_garden.csv"),
model_function = school_garden_function,
numberOfModelRuns = 1000, #run 1000 times
functionSyntax = "plainNames"
)
The way we present NPV values can influence decision makers. The same information presented in different ways can even lead to different decisions. Here we derive a plot that compares the decision options as pure NPV values.
source("functions/plot_distributions.R")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
plot_distributions(mcSimulation_object = garden_simulation_results,
vars = c("NPV_garden", "NPV_no_garden"),
method = 'hist_simple_overlay',
base_size = 7,
x_axis_name = "Comparative NPV outcomes")
Under Prospect Theory the way we present NPV values can influence decision makers - the same information presented in different ways can lead to different decisions. For example, framing a projected NPV gain as a “reduction in potential loss” might make it more attractive to decision makers due to loss aversion.
Here we plot the distribution for the decision and frame the
projected NPV gain for the ‘decision’ (distributions for the two options
with the NPV values of the no garden option subtracted from
those for the garden). If we display this as a “Reduction
in potential loss” it might be more attractive to decision makers due to
loss aversion, i.e. school boards might put more emphasis on avoiding
potential losses than on seeking gains. We can frame our results as a
strategy that minimizes losses rather than one that maximizes gains.
plot_distributions(mcSimulation_object = garden_simulation_results,
vars = "decision",
method = 'hist_simple_overlay',
base_size = 7,
x_axis_name = "Reduction in potential loss")
Use gt_plt_summary() from {gtExtras}
library(gtExtras)
library(svglite)
# Subset the outputs from the mcSimulation function (y) to summarize only on the variables that we want.
mcSimulation_summary <- data.frame(garden_simulation_results$x[2:55],
garden_simulation_results$y[1:3])
gt_plt_summary(mcSimulation_summary)
| mcSimulation_summary | ||||||
| 1000 rows x 57 cols | ||||||
| Column | Plot Overview | Missing | Mean | Median | SD | |
|---|---|---|---|---|---|---|
| discount_rate | 0.0% | 6.5 | 6.5 | 0.9 | ||
| size_of_garden | 0.0% | 75.2 | 75.2 | 14.9 | ||
| CV_value | 0.0% | 0.3 | 0.3 | 0.1 | ||
| inflation_rate | 0.0% | 7.5 | 7.5 | 1.5 | ||
| if_students_like | 0.0% | 0.6 | 0.6 | 0.1 | ||
| if_parents_like | 0.0% | 0.7 | 0.7 | 0.1 | ||
| if_community_likes | 0.0% | 0.5 | 0.5 | 0.2 | ||
| if_effective_manage | 0.0% | 0.6 | 0.6 | 0.1 | ||
| if_garden_yield_enough | 0.0% | 0.5 | 0.5 | 0.1 | ||
| if_garden_healthy | 0.0% | 0.7 | 0.7 | 0.1 | ||
| if_teachers_like | 0.0% | 0.5 | 0.5 | 0.2 | ||
| if_effective_teaching | 0.0% | 0.6 | 0.6 | 0.2 | ||
| if_effective_training | 0.0% | 0.5 | 0.5 | 0.2 | ||
| if_offer_green_space | 0.0% | 0.7 | 0.7 | 0.1 | ||
| if_reduce_polution | 0.0% | 0.3 | 0.3 | 0.1 | ||
| if_biophysical_good | 0.0% | 0.3 | 0.3 | 0.1 | ||
| equipment_cost | 0.0% | 74.4 | 74.8 | 15.6 | ||
| construction_cost | 0.0% | 22.6 | 22.6 | 4.8 | ||
| garden_designing_costs | 0.0% | 12.5 | 12.5 | 1.6 | ||
| teacher_training_cost | 0.0% | 12.5 | 12.7 | 4.5 | ||
| school_board_planning | 0.0% | 9.0 | 9.0 | 1.8 | ||
| teaching_equipment | 0.0% | 7.6 | 7.5 | 1.5 | ||
| compost_starting | 0.0% | 7.5 | 7.5 | 1.5 | ||
| worm_starting | 0.0% | 3.4 | 3.4 | 0.9 | ||
| livestock_costs | 0.0% | 3.5 | 3.5 | 0.9 | ||
| if_family_pays_establishment | 0.0% | 0.3 | 0.4 | 0.1 | ||
| establishment_family_portion_paid | 0.0% | 0.3 | 0.3 | 0.1 | ||
| maintaining_labor | 0.0% | 32.6 | 32.4 | 4.6 | ||
| teacher_salary_cost | 0.0% | 24.9 | 25.0 | 3.1 | ||
| teaching_equipment_annual | 0.0% | 7.6 | 7.6 | 1.5 | ||
| teaching_tools | 0.0% | 3.5 | 3.5 | 0.9 | ||
| seed_costs | 0.0% | 1.5 | 1.5 | 0.3 | ||
| fertilizer | 0.0% | 1.5 | 1.5 | 0.3 | ||
| plant_protection | 0.0% | 3.5 | 3.6 | 0.9 | ||
| livestock_maint | 0.0% | 6.0 | 6.1 | 2.4 | ||
| annual_teacher_training | 0.0% | 4.0 | 4.0 | 0.6 | ||
| if_school_has_canteen | 0.0% | 0.3 | 0.3 | 0.1 | ||
| canteen_savings | 0.0% | 7.5 | 7.5 | 1.6 | ||
| sale_of_yield | 0.0% | 20.1 | 20.1 | 6.0 | ||
| extra_cirricular_savings | 0.0% | 60.9 | 61.6 | 24.4 | ||
| formal_edu_savings | 0.0% | 60.0 | 61.0 | 24.4 | ||
| outside_investment_value | 0.0% | 174.0 | 124.5 | 166.7 | ||
| increased_enrollment_value | 0.0% | 53.0 | 51.2 | 27.0 | ||
| if_increase_tuition | 0.0% | 0.2 | 0.2 | 0.1 | ||
| tuition_increase | 0.0% | 7.4 | 7.5 | 1.5 | ||
| child_veg_access | 0.0% | 7.5 | 7.5 | 1.6 | ||
| child_healthier_choices | 0.0% | 293.5 | 291.4 | 119.0 | ||
| green_space_value | 0.0% | 150.2 | 149.9 | 30.9 | ||
| reduce_polution_value | 0.0% | 14.8 | 14.7 | 3.2 | ||
| school_event_value | 0.0% | 29.8 | 29.6 | 12.8 | ||
| value_of_non_garden_land_use | 0.0% | 35.0 | 34.8 | 9.0 | ||
| if_parking | 0.0% | 0.1 | 0.1 | 0.0 | ||
| parking_value | 0.0% | 150.2 | 149.1 | 31.2 | ||
| costs_of_non_garden_land_use | 0.0% | 3.0 | 3.0 | 1.2 | ||
| NPV_garden | 0.0% | 1,188.6 | 1,103.0 | 545.9 | ||
| NPV_no_garden | 0.0% | 324.6 | 246.1 | 271.5 | ||
| decision | 0.0% | 864.0 | 842.6 | 603.0 | ||
Summary of the savings
summary(garden_simulation_results$y$decision)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -840.7 488.7 842.6 864.0 1209.1 3778.0
Summary of costs
summary(garden_simulation_results$y$total_costs)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 519.1 611.1 643.8 642.6 672.7 834.9
summary(garden_simulation_results$y$Cashflow_garden1)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -135.75 24.15 82.87 98.77 156.99 687.72
source("functions/plot_cashflow.R")
plot_cashflow(mcSimulation_object = garden_simulation_results,
cashflow_var_name = "Cashflow_garden")
# Subset the outputs from the mcSimulation function (y) by selecting the correct variables be sure to run the multi_EVPI only on the variables that the we want.
mcSimulation_table <- data.frame(garden_simulation_results$x,
garden_simulation_results$y[1:3])
source("functions/multi_EVPI.R")
evpi <- multi_EVPI(mc = mcSimulation_table, first_out_var = "NPV_garden")
## [1] "Processing 3 output variables. This can take some time."
## [1] "Output variable 1 (NPV_garden) completed."
## [1] "Output variable 2 (NPV_no_garden) completed."
## [1] "Output variable 3 (decision) completed."
source("functions/plot_evpi.R")
plot_evpi(evpi, decision_vars = "decision")
## Warning: There are no variables with a positive EVPI. You probably do not need
## a plot for that.
Projection to Latent Structures model can give us some sense of the correlation strength and direction for model variables and our outcome variables.
source("functions/pls_model.R")
pls_result <- pls_model(object = garden_simulation_results,
resultName = names(garden_simulation_results$y)[1],
ncomp = 1)
input_table <- read.csv("inputs_school_garden.csv")
source("functions/plot_pls.R")
plot_pls(pls_result, input_table = input_table, threshold = 0)
The full repository can be accessed with the following QR code.